Differentially Private Model Selection via Stability Arguments and the Robustness of the Lasso

نویسندگان

  • Adam Smith
  • Abhradeep Thakurta
  • SMITH THAKURTA
چکیده

We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in the following sense: Suppose that there is a nonprivate model selection procedure f which is the reference to which we compare our performance. Our differentially private algorithms output the correct value f(D) whenever f is stable on the input data set D. We work with two notions, perturbation stability and subsampling stability. We give two classes of results: generic ones, that apply to any function with discrete output set; and specific algorithms for the problem of sparse linear regression. The algorithms we describe are efficient and in some cases match the optimal nonprivate asymptotic sample complexity. Our algorithms for sparse linear regression require analyzing the stability properties of the popular LASSO estimator. We give sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and show that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso

We design differentially private algorithms for statistical model selection. Given a data set and alarge, discrete collection of “models”, each of which is a family of probability distributions, the goal isto determine the model that best “fits” the data. This is a basic problem in many areas of statistics andmachine learning.We consider settings in which there is a well-defined...

متن کامل

Differenced-Based Double Shrinking in Partial Linear Models

Partial linear model is very flexible when the relation between the covariates and responses, either parametric and nonparametric. However, estimation of the regression coefficients is challenging since one must also estimate the nonparametric component simultaneously. As a remedy, the differencing approach, to eliminate the nonparametric component and estimate the regression coefficients, can ...

متن کامل

Differentially Private Local Electricity Markets

Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guarantying to protect their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...

متن کامل

Nearly Optimal Private LASSO

We present a nearly optimal differentially private version of the well known LASSO estimator. Our algorithm provides privacy protection with respect to each training example. The excess risk of our algorithm, compared to the non-private version, is Õ(1/n), assuming all the input data has bounded `∞ norm. This is the first differentially private algorithm that achieves such a bound without the p...

متن کامل

Dynamics of a Delayed Epidemic Model with Beddington-DeAngelis ‎Incidence Rate and a Constant Infectious Period

In this paper, an SIR epidemic model with an infectious period and a non-linear Beddington-DeAngelis type incidence rate function is considered. The dynamics of this model depend on the reproduction number R0. Accurately, if R0 < 1, we show the global asymptotic stability of the disease-free equilibrium by analyzing the corresponding characteristic equation and using compa...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013